@waleedlatif1
Collaborator

Summary

  • Unwrap the wrapped output returned by the Mistral parse route so that knowledge base (KB) PDF parsing works

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Dec 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
docs Skipped Skipped Dec 12, 2025 2:02am

@greptile-apps
Contributor

greptile-apps bot commented Dec 12, 2025

Greptile Overview

Greptile Summary

Fixed the Mistral OCR parser to correctly handle wrapped API responses when processing PDFs for knowledge base uploads. The API route wraps Mistral's response in {success: true, output: mistralData}, but the parser read the pages array from the root of the response, so it never found the pages and PDF parsing failed.

Key Changes:

  • Added conditional logic to unwrap ocrResult.output when it exists and pages is not at root level
  • Replaced all references to ocrResult with mistralData to use the unwrapped data consistently
  • Removed unnecessary comments to improve code readability

The fix ensures PDF parsing works correctly for knowledge base document processing.
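
The conditional unwrap described above can be sketched roughly as follows. This is a minimal illustration, not the code from apps/sim/tools/mistral/parser.ts: the type names (`MistralPage`, `MistralOcrData`, `OcrResult`) and the helper `unwrapOcrResult` are hypothetical, and only the field names `success`, `output`, `pages`, and `model` come from the summary and diagram in this PR.

```typescript
// Hypothetical types for illustration; the real parser.ts may define these differently.
interface MistralPage {
  index: number;
  markdown: string;
}

interface MistralOcrData {
  pages?: MistralPage[];
  model?: string;
}

// The route may hand back either the raw OCR payload or a wrapper around it.
type OcrResult = MistralOcrData & { success?: boolean; output?: MistralOcrData };

// Hypothetical helper: returns the object that actually holds `pages`.
function unwrapOcrResult(ocrResult: OcrResult): MistralOcrData {
  // Unwrap only when `output` exists and `pages` is not already at the root.
  if (ocrResult.output && !ocrResult.pages) {
    return ocrResult.output;
  }
  // Direct response: `pages` is at the root, so use the result as-is.
  return ocrResult;
}

// Wrapped shape, as produced by the API route.
const wrapped: OcrResult = {
  success: true,
  output: { pages: [{ index: 0, markdown: '# Title' }] },
};

// Direct shape, as returned by the OCR API itself.
const direct: OcrResult = { pages: [{ index: 0, markdown: '# Title' }] };

const fromWrapped = unwrapOcrResult(wrapped);
const fromDirect = unwrapOcrResult(direct);
```

Checking both conditions, rather than only `output`, keeps the direct shape working unchanged, which matches the two branches in the sequence diagram.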

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The fix correctly addresses a specific bug where the API response wrapping was not being handled. The logic is sound: it checks if ocrResult.output exists and if pages is not at root level, then unwraps to ocrResult.output. All subsequent references consistently use mistralData instead of ocrResult. The change is surgical and doesn't affect any other functionality. No new dependencies, security issues, or breaking changes introduced.
  • No files require special attention

Important Files Changed

File Analysis

Filename | Score | Overview
apps/sim/tools/mistral/parser.ts | 5/5 | Fixed wrapped output handling by conditionally unwrapping API response when ocrResult.output exists without pages at root level

Sequence Diagram

sequenceDiagram
    participant Client as Tool Client
    participant Parser as mistralParserTool
    participant API as /api/tools/mistral/parse
    participant Mistral as Mistral OCR API
    
    Client->>Parser: Call with filePath & apiKey
    Parser->>API: POST request
    API->>API: Validate auth & file access
    API->>Mistral: POST to /v1/ocr
    Mistral-->>API: Returns {pages: [...], model: "...", ...}
    API-->>Parser: Wraps in {success: true, output: {...}}
    Parser->>Parser: Check if ocrResult.output exists && !ocrResult.pages
    alt Wrapped response (output exists, no pages at root)
        Parser->>Parser: mistralData = ocrResult.output
    else Direct response (pages at root)
        Parser->>Parser: mistralData = ocrResult
    end
    Parser->>Parser: Extract pages from mistralData.pages
    Parser->>Parser: Process markdown content
    Parser-->>Client: Return {success: true, output: {content, metadata}}
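
The parser-side steps of the diagram (unwrap, extract pages, process markdown, return `{success, output: {content, metadata}}`) might look something like the sketch below. Everything here is an assumption for illustration: the names `OcrPageSketch`, `OcrPayloadSketch`, `ParserResultSketch`, and `toParserResult`, the `pageCount` metadata field, and the choice to join page markdown with blank lines are not taken from the PR.

```typescript
// Hypothetical shapes; only `pages`, `markdown`, `model`, `success`,
// `output`, `content`, and `metadata` are named in the PR discussion.
interface OcrPageSketch {
  index: number;
  markdown: string;
}

interface OcrPayloadSketch {
  pages?: OcrPageSketch[];
  model?: string;
}

interface ParserResultSketch {
  success: boolean;
  output: {
    content: string;
    // `pageCount` is an assumed metadata field, not confirmed by the PR.
    metadata: { pageCount: number; model: string | null };
  };
}

function toParserResult(
  raw: OcrPayloadSketch & { success?: boolean; output?: OcrPayloadSketch }
): ParserResultSketch {
  // Step 1: pick the wrapped or direct payload, as in the alt/else branch.
  const mistralData: OcrPayloadSketch = raw.output && !raw.pages ? raw.output : raw;

  // Step 2: extract pages, tolerating a missing array.
  const pages = mistralData.pages ?? [];

  // Step 3: join non-empty page markdown with blank lines (an assumed format).
  const content = pages
    .map((page) => page.markdown.trim())
    .filter((markdown) => markdown.length > 0)
    .join('\n\n');

  // Step 4: return the tool-style result described in the last diagram step.
  return {
    success: true,
    output: {
      content,
      metadata: { pageCount: pages.length, model: mistralData.model ?? null },
    },
  };
}

const sampleResult = toParserResult({
  success: true,
  output: {
    model: 'example-model', // placeholder value, not a real model name
    pages: [
      { index: 0, markdown: '# A\n' },
      { index: 1, markdown: 'B' },
    ],
  },
});
```

Reading every field from `mistralData` after the unwrap, never from the raw result, is the consistency point the review summary highlights.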

@greptile-apps greptile-apps bot left a comment

1 file reviewed, no comments

@waleedlatif1 waleedlatif1 merged commit 3bde9e8 into staging Dec 12, 2025
9 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/mistral branch December 12, 2025 02:15